Novel chaotic oppositional fruit fly optimization algorithm for feature selection applied on COVID 19 patients’ health prediction

The fast-growing quantity of information hinders the process of machine learning, making it computationally costly and yielding substandard results. Feature selection is a pre-processing method for obtaining the optimal subset of features in a data set. Optimization algorithms struggle to decrease the dimensionality while retaining accuracy in high-dimensional data sets. This article proposes a novel chaotic oppositional fruit fly optimization algorithm, an improved variation of the original fruit fly algorithm, advanced and adapted for binary optimization problems. The proposed algorithm is tested on ten unconstrained benchmark functions and evaluated on twenty-one standard datasets taken from the University of California, Irvine repository and Arizona State University. Further, the presented algorithm is assessed on a coronavirus disease dataset as well. The proposed method is then compared with several well-known feature selection algorithms on the same datasets. The results show that the presented algorithm predominantly outperforms other algorithms in selecting the most relevant features by decreasing the number of utilized features and improving classification accuracy.


Introduction
Information and data are at the core of technological evolution. With the progress of technology, extensive datasets have improved machine learning models in numerous domains but have made the analysis of these datasets remarkably strenuous, considering that surplus, noisy and irrelevant data is abundant within them. That abundance of inconsequential data hinders the machine learning process, making it computationally expensive and frequently resulting in substandard performance and accuracy of the model. Metaheuristic algorithms are exceptionally effective optimization methods, particularly in tackling demanding, high-dimensional problems. Owing to their strong performance, researchers utilize metaheuristic algorithms to resolve feature selection problems. Eminent nature-inspired metaheuristic algorithms include evolutionary algorithms (EA), inspired by biological evolution (reproduction, mutation, recombination, and selection), and swarm intelligence (SI) algorithms, which mimic the behavioural patterns of animals in a herd, since the herd shows substantial collective intelligence compared to that of each individual.

Machine learning and feature selection
The focal objective of machine learning is successful output prediction of the algorithm for each input through experience [1]. Machine learning differentiates two types of scenarios: supervised and unsupervised. The one used in this manuscript, supervised learning [2], utilizes labelled datasets to train algorithms for accurate predictions of outcomes or data classification. A copious amount of data within datasets is what drives the machine learning model. At the same time, these large datasets, packed with redundant and inessential data, affect the accuracy and computational complexity of the machine learning process. Frequently, these datasets are high-dimensional, which impedes the performance of the machine learning model as well. This phenomenon is referred to as the curse of dimensionality [3].
Hence, identifying essential information is crucial to tackling this issue. For this reason, dimensionality reduction [4], the process of reducing the number of classification variables, is a main pre-processing task for machine learning. There are two approaches to dimensionality reduction: feature extraction and feature selection (FS). While feature extraction [5] generates new variables derived from the primary set of data, FS selects a subset of relevant informative variables for the desired objective.
The purpose of FS is to determine the relevant subset from high-dimensional data sets eliminating the insignificant features, thus enhancing the classification accuracy for machine learning. There are three feature selection methods: filter, wrapper and the embedded methods, as per G. Chandrashekar et al. [6].
Wrapper methods utilize learning algorithms, like classifiers, to evaluate feature subsets to find relevant features. Wrapper methods yield the best-performing feature subsets (smaller subsets and higher classification accuracy) but are computationally demanding.
Filter methods do not use a training process and instead designate a score to feature subsets using a measure. Due to that, this method is not as computationally demanding as the wrapper and creates a universal set (unadjusted to a specific prediction model).
The embedded method employs FS as a segment of the model-creating process, i.e. methods perform FS during the model training. These methods are more accurate than filters, with the same execution speed. With computational complexity in mind, embedded methods are in between the methods mentioned above.

Paper goal and structure
One might wonder whether a new optimization method is needed, considering that there are numerous optimization algorithms in studies that carry out the task rather well.
The No Free Lunch (NFL) theorem [7] demonstrates that no single algorithm can resolve all optimization problems. Consequently, present-day algorithms for feature selection are not capable of solving all feature selection problems. This inspires researchers to enhance and adapt existing algorithms, or to present new ones, to cope with a wide range of problems.
Optimization algorithms struggle to provide optimal informative subsets within high-dimensional data sets. Therefore, a new metaheuristic algorithm is needed to improve the solving of feature selection problems.
This manuscript takes a well-known swarm intelligence metaheuristic, the fruit fly optimization algorithm (FFO), and improves and adapts it for solving FS problems in a wrapper-based approach. The goal of the research presented in this paper is to improve the solving of feature selection problems with the proposed chaotic oppositional fruit fly optimization (COFFO) algorithm by obtaining high classification accuracy on different datasets. COFFO is tested on ten unconstrained benchmark functions (CEC2019), then on 21 standard datasets taken from the University of California, Irvine (UCI) repository and Arizona State University (ASU), along with the coronavirus disease (COVID-19) dataset. Additionally, the proposed method is compared with several well-known feature selection algorithms on the same datasets. The results show that the presented COFFO predominantly outperforms other algorithms in selecting the most relevant features.
The following research questions inspire this work:
• Is it achievable to further improve the original FFO algorithm for high-dimensional feature selection problems?
• Is it attainable to further improve the solving of FS problems with the proposed COFFO by enhancing the accuracy and selecting features with a higher impact on the target variable?
The contributions of this research are summarized as follows:
• The proposal of COFFO, an upgraded variant of the original FFO that is suitable for solving even high-dimensional FS problems.
• The method implements chaotic behaviour and opposition-based learning to improve the population diversity and exploratory capacity of FFO.
• After extensive testing of the proposed method and comparing it with other well-known feature selection algorithms, the conclusion is that the solving of FS problems is further improved.
• Implementing FFO and COFFO algorithms in COVID-19 patient health prediction is a beneficial contribution to medicine.
The structure of this paper is as follows. Section 2 provides a brief overview of swarm intelligence algorithms and their applications in various fields. Section 3 presents the basic fruit fly optimization algorithm and summarizes its downsides before proposing an improved variation of this promising algorithm. Sections 4 and 5 present the results of the presented approach, as well as a comparison with other well-known methods, first for the standard CEC2019 benchmarks and then for twenty-one standard datasets taken from the UCI and ASU. Section 6 displays the application of COFFO on COVID-19 datasets. Section 7 discusses the advantages and disadvantages of COFFO. Lastly, Section 8 draws conclusions and outlines future directions.

Related works
Nature-inspired metaheuristic algorithms have shown high efficiency in solving numerous optimization problems and, as such, have recently taken the lead in solving complex real-world problems. Metaheuristic algorithms enable attaining suboptimal solutions in a reasonable time frame.
Various problems in diverse disciplines have benefited from SI problem-solving solutions, such as medical applications for diagnosing serious diseases in early stages [15] or the COVID-19 cases predictions [16], problems with optimization of artificial neural network parameters [17][18][19][20], the management and normal functioning of wireless sensor networks [21][22][23] up to resolving issues in cloud computing [24][25][26]. The paper [27] offered an extensive analysis of metaheuristics for the feature selection problem.
Regarding COVID-19 patient diagnostics, paper [28] proposed a hybrid FS method to find the optimal subset of features obtained from chest computed tomography images. Research [29] introduced a deep network model to pinpoint the COVID-19 disease based on X-ray images. The Relief-based FS algorithm suggested in [30] is used to filter out unnecessary features in COVID-19 prediction.
Multiple swarm intelligence algorithms are employed to solve the feature selection problem [31][32][33]. For that purpose, numerous binary metaheuristic methods have been created, predominantly for wrapper-based FS. Two essential concepts, the transfer function (TF) and binarization, are utilized. Binary particle swarm optimization (BPSO) is presented in [34] to resolve discrete problems. The dragonfly algorithm (DA) [35], created to solve continuous optimization problems by simulating the swarming patterns of dragonflies, got its binary version BDA [36], which utilizes time-varying transfer functions. Heavy exploitation in BDA can lead to entrapment in local optima, thus failing to obtain the global optimal solution. An improved version, the hyper learning binary dragonfly algorithm (HLBDA) [37], uses a hyper learning strategy that enables the dragonfly to learn from both personal and global best solutions throughout the search phase. Research [38] presented a binary artificial bee colony (BABC) based on the Jaccard coefficient dissimilarity, but the method has a complex structure. A binary version of the grasshopper optimization algorithm (BGOA) [39] employs sigmoid and V-shaped transfer functions and has an integrated mutation operator to improve the diversification stage.

Proposed method
First, the original fruit fly optimization algorithm is introduced, followed by the proposed hybrid method for feature selection problem.

Original fruit fly optimization algorithm
The fruit fly optimization algorithm, proposed by Prof. Pan [40,41], is a relatively new nature-inspired optimization algorithm. In contrast to other metaheuristic algorithms, FFO is easy to comprehend and apply, thanks to its simple computational operations.
This method is a promising swarm intelligence algorithm motivated by the foraging behavioural patterns of fruit flies. The fruit fly surpasses other species in vision and olfaction, on which it predominantly relies: fruit flies can gather miscellaneous aerial smells even when the source of food is far away. Throughout the foraging activity, fruit flies scout and locate food sources surrounding the swarm and estimate the smell concentration for each food source. When the best location with the highest smell concentration is detected, the swarm navigates towards it.
Undeniably, effective communication and teamwork among individual fruit flies are essential to success in solving an optimization problem. The algorithm contains four phases:
• initialization,
• osphresis foraging,
• population evaluation,
• vision.
Initially, the parameters are set: the maximum number of iterations and the population size. The solutions, i.e. fruit flies, are initiated randomly:

$$X_{i,j} = LB_j + rand \cdot (UB_j - LB_j) \tag{1}$$

where $X_{i,j}$ denotes the i-th solution and j denotes the element's position in the i-th solution. LB represents the lower bound, UB represents the upper bound, and rand is a random number from the uniform distribution. Then, the position update of each solution occurs in accordance with the osphresis foraging phase. The solutions are distributed randomly from the current location:

$$X_{i,j}^{(t+1)} = X_{i,j}^{(t)} + rand() \tag{2}$$

where $X_{i,j}^{(t+1)}$ represents the new position, $X_{i,j}^{(t)}$ represents the current solution, $rand() \in [-1, 1]$, and t denotes the iteration counter. Following the position update, distance and smell are calculated. Then the smell concentration, i.e. the function of smell (fitness function), is computed for each solution. If the new best fitness function value is better than the previous best, the location of the solution with the best fitness function value replaces the positions of all solutions. Otherwise, the old location is retained. This process represents the vision foraging phase of the algorithm. The algorithm continues until the stopping criteria are satisfied and yields the best solution.
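The four phases described above can be illustrated with a minimal sketch. It assumes that Eq (1) takes the usual uniform form LB + rand · (UB − LB) and that Eq (2) adds a uniform step from [−1, 1] to each component. The names `basic_ffo` and `sphere`, the clipping to the bounds, the minimization convention and all default parameter values are hypothetical choices made for illustration and are not taken from the original FFO publications.

```python
import random


def sphere(x):
    """Toy smell (fitness) function to minimize; an illustrative assumption."""
    return sum(v * v for v in x)


def basic_ffo(fitness, dim, lb, ub, pop_size=10, max_iter=100, seed=0):
    """Sketch of the basic FFO loop (minimization), under the stated assumptions."""
    rng = random.Random(seed)
    # Initialization: every component is drawn uniformly between the bounds.
    swarm = [[lb + rng.random() * (ub - lb) for _ in range(dim)]
             for _ in range(pop_size)]
    best = min(swarm, key=fitness)[:]
    best_fit = fitness(best)
    for _ in range(max_iter):
        # Osphresis foraging: random step in [-1, 1] from the current location.
        # Clipping to [lb, ub] is an assumption added to keep flies feasible.
        for fly in swarm:
            for j in range(dim):
                fly[j] = min(ub, max(lb, fly[j] + rng.uniform(-1.0, 1.0)))
        # Population evaluation: find the fly with the best smell concentration.
        candidate = min(swarm, key=fitness)
        candidate_fit = fitness(candidate)
        # Vision: on improvement the whole swarm moves to the best location,
        # otherwise the current locations are kept.
        if candidate_fit < best_fit:
            best, best_fit = candidate[:], candidate_fit
            swarm = [best[:] for _ in range(pop_size)]
    return best, best_fit


if __name__ == "__main__":
    solution, value = basic_ffo(sphere, dim=3, lb=-5.0, ub=5.0)
    print(solution, value)
```

The fixed step range in the osphresis phase is what the following subsection identifies as a limitation, since the step size does not adapt as the search converges.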

Motivation for improvement and proposed chaotic oppositional fruit fly optimization algorithm
The adaptation of the FFO algorithm to a particular problem is uncomplicated owing to its relatively simple configuration. Notwithstanding the good performance of the basic FFO [40,41], extensive practical simulations on a wide range of benchmark instances from the Congress on Evolutionary Computation (CEC) showed that the basic FFO can be further improved.
Namely, in some runs the basic FFO, due to its stochastic nature, exhibits weak exploration ability, because it uses a fixed position update strategy and can easily become stuck in local optima. Moreover, it was observed that its exploitation capabilities can be further enhanced.
The method proposed in this study addresses the above-mentioned drawbacks by implementing opposition-based learning (OBL) and chaotic behaviour in the original FFO approach. In line with these modifications, the method presented in this study is named the chaotic oppositional fruit fly optimization (COFFO) algorithm.
OBL was introduced in 2005 by Tizhoosh [42], and it was shown that this mechanism can substantially improve the exploration and exploitation abilities of metaheuristic methods [42,43].
The OBL mechanism is mathematically described as follows: let $x_j$ denote the j-th parameter of solution x and let $x_j^o$ represent its opposite number. The opposite number of the j-th parameter of individual x is calculated as follows:

$$x_j^o = LB_j + UB_j - x_j \tag{3}$$

where $x_j \in [LB_j, UB_j]$ and $LB_j, UB_j \in \mathbb{R}$, $\forall j \in \{1, 2, 3, \ldots, D\}$. Parameter D represents the number of solution dimensions (parameters).

In complex implementations, the imbalance between exploitation and exploration and the randomness of the initialization phase cause the entrapment of optimization algorithms in local optima. The literature proposes chaos theory as one of the methods for resolving this issue. The chaos optimization algorithm (COA) [44] is an example of chaos implementation that exploits the nature of the chaotic structure. Classification performance can be improved by applying a chaotic system rather than random parameter values [45]. Examples of these implementations are the following: chaotic whale optimization algorithm (CWOA) [46], chaotic grey wolf optimization (CGWO) [47] and chaotic grasshopper optimization algorithm (CGOA) [48].
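As a small illustration of the opposite-number rule, the sketch below mirrors every component of a solution within its own bounds, i.e. it computes LB_j + UB_j − x_j per dimension. The function name `opposite_solution` and the sample vectors are hypothetical and serve only as an example.

```python
def opposite_solution(x, lb, ub):
    """Return the component-wise opposite of x: LB_j + UB_j - x_j."""
    if not (len(x) == len(lb) == len(ub)):
        raise ValueError("x, lb and ub must have the same length")
    return [low + high - value for value, low, high in zip(x, lb, ub)]


x = [1.0, -2.0, 0.5]
lb = [0.0, -5.0, 0.0]
ub = [4.0, 5.0, 1.0]
print(opposite_solution(x, lb, ub))  # [3.0, 2.0, 0.5]
```

Applying the rule twice returns the original point, and a point at the centre of its interval is its own opposite, as the third component shows.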
Chaos represents a non-linear phenomenon of a dynamic but deterministic system with stochastic patterns that is exceedingly sensitive to its initial conditions. Although multiple chaotic maps exist, experimental testing shows that the logistic map provided the best results with the introduced COFFO. The chaotic-based search strategy in the presented COFFO is driven by a chaotic sequence generated in line with the limitations of a specific problem. When the sequence is created, individuals employ it to explore the search space. COFFO uses the chaotic sequence β, which starts from an arbitrary initial number $\beta_0$ and is created by the logistic mapping. The logistic map executes in K steps in the following way:

$$\beta_{i,j}^{k+1} = \mu \beta_{i,j}^{k} \left(1 - \beta_{i,j}^{k}\right) \tag{4}$$

where $\beta_{i,j}^{k}$ and $\beta_{i,j}^{k+1}$ denote the chaotic variable for the j-th component of the i-th solution in steps k and k+1, respectively, while μ denotes the chaotic control parameter. The μ typically has the value 4 [49], a value used in this work as well. To guarantee chaotic behaviour of individuals, $\beta_{i,j} \neq 0.25, 0.5$ and $0.75$, and $\sigma_{i,j} \in (0, 1)$.
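The behaviour of the logistic map can be checked with a short sketch, assuming the standard recursion β(k+1) = μ · β(k) · (1 − β(k)) with μ = 4. The function name `logistic_sequence` and the starting value 0.1 are illustrative assumptions.

```python
def logistic_sequence(beta0, steps, mu=4.0):
    """Iterate the logistic map `steps` times, starting from beta0."""
    # Starting values outside (0, 1), or equal to 0.25, 0.5 or 0.75,
    # do not produce a chaotic sequence for mu = 4.
    if not 0.0 < beta0 < 1.0 or beta0 in (0.25, 0.5, 0.75):
        raise ValueError("beta0 must lie in (0, 1) and avoid 0.25, 0.5, 0.75")
    sequence = [beta0]
    for _ in range(steps):
        current = sequence[-1]
        sequence.append(mu * current * (1.0 - current))
    return sequence


if __name__ == "__main__":
    for k, beta in enumerate(logistic_sequence(0.1, 5)):
        print(k, beta)
```

The excluded starting values are the ones whose orbits collapse: 0.5 maps to 1 and then to 0, 0.25 reaches the fixed point 0.75 after one step, and 0.75 is itself a fixed point.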
The mapping of solutions onto the generated chaotic sequences is achieved with Eq (5) for each component j of individual i, where $X_i^c$ is the new location of individual i after chaotic disruptions. To establish an initial population of high quality, the proposed COFFO first incorporates chaotic-opposition-based initialization, which is shown in Algorithm 1.

Algorithm 1 Chaotic-opposition-based initialization pseudo-code
Step 1: Generate standard random population P of N solutions with Eq (1).
Step 2: Generate opposition population $P^o$ for the first N/2 individuals by triggering OBL using Eq (3).
Step 3: Generate chaotic population $P^c$ of N/2 individuals by mapping solutions from P to chaotic sequences using expressions (4) and (5).
Step 4: Calculate the fitness of all solutions from P, $P^o$ and $P^c$.
Step 5: Sort all individuals from $P \cup P^o \cup P^c$ according to fitness.
Step 6: Select the N best individuals from the sorted set $P \cup P^o \cup P^c$ for the initial population.
In this way, the initial population P is closer to the optimum region of the search space, and COFFO can utilize more iterations for performing exploitation and exploration in this region.
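The six steps of Algorithm 1 can be sketched as follows. Since the exact chaotic mapping is not reproduced in this text, the sketch assumes a simple product form, in which each component is multiplied by its chaotic variable and then clipped to the bounds. That mapping, the choice of the second half of P for the chaotic population, the number of logistic steps and the name `chaotic_opposition_init` are all assumptions made for illustration, and fitness is minimized.

```python
import random


def chaotic_opposition_init(fitness, n, dim, lb, ub, k_steps=5, seed=0):
    """Sketch of chaotic-opposition-based initialization (minimization)."""
    rng = random.Random(seed)

    def clip(value):
        return min(ub, max(lb, value))

    # Step 1: standard random population P of N solutions.
    population = [[lb + rng.random() * (ub - lb) for _ in range(dim)]
                  for _ in range(n)]
    half = n // 2
    # Step 2: opposition population for the first N/2 individuals.
    opposite = [[lb + ub - v for v in sol] for sol in population[:half]]
    # Step 3: chaotic population of N/2 individuals (assumed mapping).
    chaotic = []
    for sol in population[half:half + half]:
        mapped = []
        for v in sol:
            beta = rng.uniform(0.01, 0.99)      # arbitrary start in (0, 1)
            for _ in range(k_steps):
                beta = 4.0 * beta * (1.0 - beta)  # logistic map, mu = 4
            mapped.append(clip(beta * v))         # assumed product mapping
        chaotic.append(mapped)
    # Steps 4-6: evaluate, sort by fitness, keep the N best.
    merged = population + opposite + chaotic
    merged.sort(key=fitness)
    return merged[:n]


if __name__ == "__main__":
    start = chaotic_opposition_init(lambda x: sum(v * v for v in x),
                                    n=10, dim=4, lb=-5.0, ub=5.0)
    print(start[0])
```

Note that the merged set holds 2N candidates, so this initialization spends 2N fitness evaluations before the main loop starts.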
However, despite the novel initialization strategy, the exploitation ability in later cycles should also be improved. For this reason, COFFO incorporates a chaotic local search (CLS) strategy, which is executed around the current global best solution ($X^*$). Throughout every step k, a new $X^*$, represented as $X'^*$, is created by applying Eqs (6) and (7) for each component j of $X^*$, where $\beta_j^k$ is calculated by Eq (4), while λ is a dynamic shrinkage parameter that depends on the maximum number of fitness function evaluations (maxFFE) and the current fitness function evaluation (FFE) in the algorithm's execution, as given by Eq (8). The use of a dynamic λ allows a better exploitation-exploration equilibrium to be established around $X^*$. Earlier stages of execution explore a larger search area around $X^*$, while later stages emphasize fine-tuned exploitation. Alternatively, maxFFE and FFE can be replaced with T and t when the maximum number of iterations is considered as the termination condition.
In that manner, the CLS strategy is an attempt to enhance $X^*$ in K steps. If $X'^*$ achieves a better fitness value than $X^*$, then the CLS procedure terminates and $X'^*$ replaces $X^*$. Nevertheless, if $X^*$ cannot be improved in K steps, it remains in the population.
Again, empirical experiments with CLS showed that this mechanism should not be triggered too early. If it is executed in early iterations, when the search process has not converged enough, many FFEs are wasted. For that reason, an additional control parameter, the CLS trigger (clst), is incorporated; it determines whether or not the CLS around $X^*$ will be executed. The value of this parameter is determined empirically, as shown in Section 4.
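The CLS procedure can be sketched under explicit assumptions, because Eqs (6) to (8) are not reproduced in this text. The sketch assumes a form that is common in the chaotic local search literature: a chaotic point S_j = LB + β_j · (UB − LB), a candidate (1 − λ) · X*_j + λ · S_j, and a shrinkage λ = (maxFFE − FFE + 1) / maxFFE. These formulas, the name `chaotic_local_search` and the default K = 4 are hypothetical and may differ from the expressions used by the authors.

```python
import random


def chaotic_local_search(fitness, best, lb, ub, ffe, max_ffe, k_steps=4, seed=0):
    """Try to improve `best` in at most k_steps chaotic moves (minimization).

    Returns (solution, fitness value, updated FFE counter).
    """
    rng = random.Random(seed)
    best_fit = fitness(best)
    # Assumed shrinkage: close to 1 early in the run, close to 0 at the end.
    lam = (max_ffe - ffe + 1) / max_ffe
    betas = [rng.uniform(0.01, 0.99) for _ in best]
    for _ in range(k_steps):
        betas = [4.0 * b * (1.0 - b) for b in betas]  # logistic map, mu = 4
        candidate = [(1.0 - lam) * x + lam * (lb + b * (ub - lb))
                     for x, b in zip(best, betas)]
        ffe += 1                                       # one evaluation spent
        candidate_fit = fitness(candidate)
        if candidate_fit < best_fit:
            return candidate, candidate_fit, ffe       # stop at first improvement
    return best, best_fit, ffe                         # X* stays in the population


if __name__ == "__main__":
    current_ffe, clst = 9000, 8350
    if current_ffe > clst:  # the CLS trigger described in the text
        print(chaotic_local_search(lambda x: sum(v * v for v in x),
                                   [3.0, -2.0], -5.0, 5.0,
                                   ffe=current_ffe, max_ffe=25050))
```

With this assumed λ, the candidate is a convex combination of the current best and a chaotic point inside the bounds, so it always stays feasible.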
Taking all into consideration, workings of proposed COFFO are summarized in Algorithm 2.

Algorithm 2 Proposed COFFO pseudo-code
Generate initial population according to Algorithm 1
Set the FFEs to 0 and define the termination criteria (maxFFEs)
Evaluate the fitness of each individual
while FFEs < maxFFEs do
    for i = 1 to N do
        Update the position according to FFO updating mechanism by Eq (2)
    end for
    Determine the X* solution
    if FFEs > clst then
        Perform CLS strategy by using Eqs (6) and (7)
        Adjust λ by applying Eq (8)
    end if
end while
Return the X* solution

Complexity in metaheuristics is measured by the number of FFEs, as the FFE is the most demanding operation. For the suggested algorithm, it is calculated from N, the number of solutions in a population, and T, the maximum number of iterations. The expression holds in the worst-case scenario, i.e. when a chaotic local search is executed and one solution is evaluated in each iteration. maxFFE is used as the termination condition in simulations for an unbiased comparative analysis, even if the proposed algorithm uses more FFEs than some algorithms in each iteration.

Simulation and comparative analysis for unconstrained functions
First, the presented approach is validated on unconstrained benchmark functions. Ten CEC2019 functions [50] are utilized to validate the performance of the presented method before applying it to a real-world task. The original FFO and nine other metaheuristic-based algorithms, namely elephant herding optimization (EHO) [51], EHO improved (EHOI) [52], sine cosine algorithm (SCA) [53], salp swarm algorithm (SSA) [14], grasshopper optimization algorithm (GOA) [12], moth-flame optimization (MFO) [54], particle swarm optimization (PSO) [13], whale optimization algorithm (WOA) [9] and biogeography-based optimization (BBO) [55], are tested on the set of ten recent benchmark functions presented at the Congress on Evolutionary Computation 2019 (CEC2019) [50], under similar circumstances. Additionally, the existing PSO embedded with chaotic opposition-based initialization (COPSO) is added for a more comprehensive comparative analysis. These results are then compared to the results obtained by the presented algorithm.
The CEC2019 bound-constrained benchmark function characteristics are given in Table 1. Research paper [52] provides the simulation results of the previously mentioned algorithms for the same benchmarks. The same experiments are conducted anew to corroborate the results from [52] and to provide an unbiased comparative analysis. The control parameters used to test the methods in [52], population size N = 50 and maximum number of iterations maxIter = 500, might prompt a biased comparative analysis, considering that not all algorithms use the same number of fitness function evaluations (FFEs) in one iteration. In the initialization phase, most of these algorithms use N evaluations and then, in every iteration, execute one more FFE for each individual in the population. Hence, the termination condition maxFFE = N + N · maxIter is set to 25,050 for all methods. That way, the same experimental conditions are established as in [52], and the comparative analysis is unbiased.
Other parameters are set as follows: the size of the population is fixed at N = 50, and the clst value was empirically determined as maxFFEs/3, which is in this case 8,350. The experiment is repeated in 30 independent runs. Table 2 shows the control parameters for COFFO used throughout the unconstrained benchmark function experiment.
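The evaluation budget and the CLS trigger quoted above follow directly from N and maxIter, which the short check below reproduces. The helper name `max_ffe` is hypothetical.

```python
def max_ffe(pop_size, max_iter):
    """N evaluations at initialization plus N evaluations per iteration."""
    return pop_size + pop_size * max_iter


budget = max_ffe(50, 500)
clst = budget // 3  # CLS trigger set to one third of the budget
print(budget, clst)  # 25050 8350
```

Fixing the budget in FFEs rather than iterations means an algorithm that spends extra evaluations per iteration simply completes fewer iterations.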
Control parameters for the metaheuristics used in this comparative analysis were set as suggested in the original manuscripts. Table 3 displays the obtained experimental results: the corresponding mean values and standard deviations of the presented and comparable methods. The best mean value is displayed in bold style for every benchmark instance, while the best standard deviation value is in italic, for easier reading.

PLOS ONE

The obtained results of EHO, EHOI, SCA, SSA, GOA, MFO, PSO, WOA and BBO are slightly different from the results in the paper [52] due to the stochastic nature of the observed algorithms. From the results in Table 3, it is apparent that the presented method outperformed the other tested algorithms. COFFO has the best mean value on eight functions (CEC1, CEC2, CEC3, CEC4, CEC5, CEC7, CEC8 and CEC9). The original FFO achieved the best mean value on the CEC6 test instance, followed by COFFO. EHOI performed best on function CEC10, marginally ahead of COFFO. COPSO obtained a better mean fitness value than the original PSO on seven functions due to chaotic opposition-based initialization. The new COFFO stands out on the CEC1 and CEC2 test instances in comparison to the other algorithms. When comparing various algorithms, contemporary computer science practice requires a statistical validation of the significance of improvements. The Friedman test [56,57], a two-way analysis of variance by ranks, demonstrates the considerable distinction between the proposed and other tested methods. Table 4 displays the ranking of twelve algorithms applied on ten functions.
COFFO's average ranking for the Friedman test is 1.20, thus demonstrating its superiority over the eleven remaining algorithms (Table 4). At the significance level α = 0.05, the Friedman statistic ($\chi_r^2 = 60.2$) is greater than the $\chi^2$ critical value ($\chi^2 = 19.7$); hence the null hypothesis ($H_0$) is rejected, allowing the conclusion that COFFO is substantially distinct from the rest of the compared methods.
Furthermore, Iman and Davenport's test [58] is conducted since, as per [59], it can be more precise than the chi-square approximation. The summary of the statistical results is given in Table 5.
The obtained Iman-Davenport statistic (10.9) is greater than the F-distribution critical value (1.89), so the second test rejects $H_0$ as well. The p-value is smaller than the significance level in both tests, as presented in Table 5.
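The reported Iman-Davenport statistic can be cross-checked from the Friedman statistic with the standard conversion F = (n − 1) · χ² / (n · (k − 1) − χ²), where n is the number of benchmark functions and k the number of algorithms. The sketch below applies it to the values stated in the text (χ² = 60.2, n = 10, k = 12); the function name `iman_davenport` is hypothetical.

```python
def iman_davenport(chi2_friedman, n_problems, k_algorithms):
    """Convert a Friedman chi-square statistic to the Iman-Davenport F statistic."""
    denominator = n_problems * (k_algorithms - 1) - chi2_friedman
    if denominator <= 0:
        raise ValueError("Friedman statistic too large for the given n and k")
    return (n_problems - 1) * chi2_friedman / denominator


f_id = iman_davenport(60.2, n_problems=10, k_algorithms=12)
# The statistic follows an F distribution with (k - 1) and (k - 1)(n - 1)
# degrees of freedom, here (11, 99).
print(round(f_id, 1))  # 10.9
```

The result agrees with the value reported above, and the critical values 19.7 (chi-square, 11 degrees of freedom) and 1.89 (F with 11 and 99 degrees of freedom) are the ones that correspond to α = 0.05.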
Since both tests reject the null hypothesis, Holm's step-down procedure, as a post-hoc procedure, is conducted with its results displayed in Table 6.
The presented algorithm significantly surpassed ten out of eleven compared methods at significance level α = 0.1, and nine out of eleven at significance level α = 0.05.
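For readers unfamiliar with Holm's step-down procedure, a minimal sketch follows: the p-values are sorted in ascending order, the i-th smallest is compared against α / (m − i) with i starting at 0, and testing stops at the first hypothesis that cannot be rejected. The function name `holm_step_down` and the sample p-values are hypothetical and are not the values from Table 6.

```python
def holm_step_down(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # ascending p-values
    rejected = [False] * m
    for rank, index in enumerate(order):
        # Threshold grows from alpha/m to alpha as hypotheses are rejected.
        if p_values[index] <= alpha / (m - rank):
            rejected[index] = True
        else:
            break  # all remaining (larger) p-values are retained as well
    return rejected


print(holm_step_down([0.001, 0.04, 0.03, 0.2], alpha=0.05))
# [True, False, False, False]
```

In the example, 0.001 passes the first threshold 0.05/4, while 0.03 exceeds the second threshold 0.05/3, so the procedure stops and the remaining hypotheses are retained.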
In addition, a Quade test [60] for the average fitness function is conducted, and the obtained F value is 8.53, while the p-value is 2.06E−10.
It can be concluded that the COFFO algorithm enhances the performance of the original FFO metaheuristic, thus affirming the goal of proposing an improved FFO algorithm.
Next, Fig 1 displays convergence speed graphs for some algorithms. The best three, COFFO, FFO and EHOI, are emphasized in these graphs with different line styles. The graphs show that the proposed COFFO algorithm has a "starting advantage", since its initialization utilizes chaotic sequences and opposition-based learning. In other words, it has a better initial population than the other algorithms, which makes the search easier.

Feature selection simulation results
The presented algorithm is validated on 21 standard datasets, collected from the UCI repository [61] and Arizona State University [62]. These feature selection datasets include: colon, arrhythmia, primary tumor, ILPD, ionosphere, leukemia, dermatology, zoo, glass, SCADI, SPECT heart, horse colic, libras movement, lung discrete, musk1, TOX 171, soybean, seeds, lymphography, LSVT and hepatitis. Details regarding the datasets (number of features and training samples, dimensions) can be retrieved from [37]. The utilized datasets comprise a diverse number of dimensions and features, of which leukemia and TOX 171 have the highest dimensionality, with 7070 and 5748 features, respectively. Therefore, the performance of the presented algorithm is evaluated on disparate structures that illustrate its effectiveness across different dimensions [63]. Five evaluation measures are used to assess the performance of the algorithm: the best fitness value, the standard deviation of the fitness value, the mean fitness value, the feature selection ratio and the classification accuracy.

Fitness evaluation and experimental conditions
The purpose of the fitness function is to estimate the quality of the solutions. Iteratively, every fruit fly is assessed by applying a fitness function. The fitness function in this research is selected to maximize classification accuracy and minimize the number of selected features. The fitness function is as follows [37]:

$$Fitness = \alpha \cdot ER + \beta \cdot \frac{|S|}{|O|}$$

where ER is the classification error, |S| is the length of the subset of selected features, and |O| is the length of the original feature set. Two weight factors, $\alpha \in [0, 1]$ and $\beta = (1 - \alpha)$, are used to indicate the influence of classification error and feature size on the fitness function. In COFFO, a transfer function is utilized to adapt the algorithm for binary problems. S-shaped and V-shaped transfer functions, named after the shape of the TF curve, are tested. The V-shaped TF provided the best results and is therefore implemented in the presented method. Table 7 provides the mathematical formulation of V-shaped transfer functions.
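The weighted fitness and the binarization step can be sketched as below. The fitness follows the weighted sum of classification error and feature ratio described above. The value α = 0.9 is only a placeholder, |tanh(x)| is one commonly used V-shaped transfer function and is not necessarily the one selected from Table 7, and the bit-flip rule is the convention usually paired with V-shaped functions. The names `fs_fitness`, `v_transfer` and `binarize` are hypothetical.

```python
import math
import random


def fs_fitness(error_rate, n_selected, n_total, alpha=0.9):
    """Weighted sum of classification error and selected-feature ratio (lower is better)."""
    beta = 1.0 - alpha
    return alpha * error_rate + beta * (n_selected / n_total)


def v_transfer(x):
    """An assumed V-shaped transfer function: |tanh(x)|, with values in [0, 1]."""
    return abs(math.tanh(x))


def binarize(position, previous_bits, rng):
    """Flip each previous bit with probability given by the transfer function."""
    bits = []
    for x, bit in zip(position, previous_bits):
        flip = rng.random() < v_transfer(x)
        bits.append(1 - bit if flip else bit)
    return bits


if __name__ == "__main__":
    rng = random.Random(42)
    mask = binarize([0.3, -1.2, 2.5, 0.0], [1, 0, 1, 1], rng)
    # error_rate=0.1 is a made-up value standing in for a classifier result
    print(mask, fs_fitness(error_rate=0.1, n_selected=sum(mask), n_total=len(mask)))
```

With a V-shaped function a component near zero almost never changes its bit, while a large magnitude in either direction makes a flip likely, which differs from S-shaped functions that map the value directly to the probability of a 1.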
Following the paper [37], the dataset is divided into the training and evaluation sets using the stratified 10-fold cross-validation method. For wrapper-based FS, the K nearest neighbour classifier (KNN, k = 5) is used to calculate the classification error. The benefits of using KNN as a learning algorithm are its simplicity and low computational cost. All methods are executed in 20 independent runs due to the non-deterministic nature of optimization algorithms, and the results are averaged. Using the maximum number of iterations maxIter = 100 as the stopping rule can produce a biased comparative analysis, since the number of utilized FFEs per iteration can vary between algorithms. Thus, the termination condition maxFFEs = N + N · maxIter is set to 1010. The population size is N = 10.
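The wrapper evaluation described above can be sketched in plain Python: a binary mask selects the feature columns, and the classification error of a 5-nearest-neighbour classifier is estimated with stratified 10-fold cross-validation. This is a simplified stand-in rather than the experimental code; the round-robin fold assignment, the squared Euclidean distance, the tie handling and the toy data are assumptions, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def stratified_folds(labels, n_folds):
    """Deal sample indices into folds class by class, keeping class ratios similar."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(n_folds)]
    position = 0
    for indices in by_class.values():
        for index in indices:
            folds[position % n_folds].append(index)
            position += 1
    return folds


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k nearest training rows (squared Euclidean distance)."""
    distances = sorted((sum((a - b) ** 2 for a, b in zip(row, query)), label)
                       for row, label in zip(train_x, train_y))
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def cv_error_rate(data, labels, mask, k=5, n_folds=10):
    """Cross-validated KNN error rate using only the features with mask bit 1."""
    columns = [j for j, bit in enumerate(mask) if bit]
    if not columns:
        return 1.0  # an empty subset is treated as the worst possible solution
    reduced = [[row[j] for j in columns] for row in data]
    errors = 0
    for fold in stratified_folds(labels, n_folds):
        held_out = set(fold)
        train_x = [r for i, r in enumerate(reduced) if i not in held_out]
        train_y = [y for i, y in enumerate(labels) if i not in held_out]
        errors += sum(knn_predict(train_x, train_y, reduced[i], k) != labels[i]
                      for i in fold)
    return errors / len(data)


# Toy data: feature 0 separates the classes, feature 1 is constant noise.
data = [[i * 0.01, 0.0] for i in range(20)] + [[10 + i * 0.01, 0.0] for i in range(20)]
labels = [0] * 20 + [1] * 20
print(cv_error_rate(data, labels, [1, 0]))  # 0.0
```

The returned error rate is the ER term that a wrapper-based fitness function would combine with the selected-feature ratio.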

Comparison with other feature selection methods
This subsection provides the comparative performance analysis between the presented algorithm and eleven eminent algorithms: HLBDA, binary dragonfly algorithm (BDA) [35], binary multiverse optimizer (BMVO) [64], binary artificial bee colony (BABC) [65], binary particle swarm optimization (BPSO) [34], success-history based adaptive differential evolution with linear population size reduction (LSHADE) [66], chaotic crow search algorithm (CCSA) [45], evolution strategy with covariance matrix adaptation (CMAES) [67], binary coyote optimization algorithm (BCOA) [68,69], COPSO and FFO. Table 8 shows the parameters for the compared algorithms. The personal learning rate (pl) and global learning rate (gl) of HLBDA are set to 0.4 and 0.7, respectively. The maximum limit for BABC is set at 5. In BMVO, the wormhole existence probability (WEP) increases from 0.02 to 1, whilst the traveling distance rate (TDR) decreases from 0.6 to 0. In BPSO, the acceleration factors are set at 2 and the inertia weight decreases from 0.9 to 0.4. In CCSA, the awareness probability (AP) and flight length (fl) are set at 0.1 and 2, respectively. In BCOA, the number of coyotes and packs are set to 5 and 2. The number of parents for CMAES is set at 25% of solutions. For LSHADE, the memory size and minimum population size are set at 5 and 4.

Table 7. V-shaped transfer functions (columns: Name, Transfer function).

Tables 9-11 show the testing results of the mean fitness, the best fitness, and the standard deviation of the fitness function for the presented COFFO. As shown in Table 9, COFFO identified the optimal best fitness value on fifteen datasets, followed by HLBDA with eight datasets.
Results in Table 10 display that COFFO detected the optimal mean fitness value in fourteen datasets, followed by HLBDA with four. These results entail that the presented COFFO can locate the optimal feature subset in most cases, yielding a satisfying performance.
As shown in Table 11, COFFO achieved the lowest standard deviation in twelve datasets, followed by BABC with four. COFFO consistently obtained better results compared to FFO. Fig 2 provides the classification accuracy results of the tested algorithms. As demonstrated, COFFO obtained the highest accuracy in 12 datasets, exceeding the remaining algorithms in procuring the optimal feature subset.
A boxplot is a type of chart often used in exploratory data analysis. It shows the minimum score (the lowest score, excluding outliers), the lower quartile (25% of scores fall below the lower quartile value), the median (the mid-point of the data), the upper quartile (75% of scores fall below the upper quartile value), the maximum score (the highest score, excluding outliers), the whiskers (scores outside the middle 50%) and the interquartile range (IQR) (the box, covering the middle 50% of scores). The average error rate over all 21 datasets was used for the boxplot analysis, to exhibit the stability, i.e. diversification, of the proposed algorithm. Fig 3 provides the boxplot analysis of eleven different algorithms. As seen in Fig 3, the presented COFFO is relatively stable and, in comparison to the second-best HLBDA and the original FFO, has a smaller IQR, that is, lower dispersion, and the best maximal score. These results uphold the effectiveness of the proposed algorithm in maintaining the highest classification accuracy.
Table 12 shows the feature selection ratio results. The length of the optimal feature subset obtained by an algorithm is proportional to the feature selection ratio: the smaller the subset, the lower the ratio. The results show that COFFO attained the smallest feature size on thirteen datasets, followed by HLBDA on seven. Compared to the other algorithms, COFFO can frequently find a small, highly informative subset of features. Evidently, COFFO is efficient in selecting the best feature selection solution and avoiding local optima.
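The boxplot quantities and the feature selection ratio described above can be sketched as follows. This is an illustration under stated assumptions: outliers are identified with the common 1.5 x IQR rule, quartiles use the inclusive method of Python's `statistics` module, and the ratio is the number of selected features divided by the total number of features. The function names are hypothetical, and the paper may compute these quantities differently.

```python
import statistics

def boxplot_summary(scores):
    """Return the quantities a boxplot displays for a list of scores."""
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumption: Tukey's 1.5 * IQR fences separate outliers from inliers.
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inliers = [s for s in scores if low_fence <= s <= high_fence]
    outliers = [s for s in scores if s < low_fence or s > high_fence]
    return {
        "min": min(inliers),   # lowest score, excluding outliers
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(inliers),   # highest score, excluding outliers
        "iqr": iqr,
        "outliers": outliers,
    }

def feature_selection_ratio(mask):
    """Fraction of features selected by a binary mask (1 = selected)."""
    return sum(mask) / len(mask)

# Made-up values for illustration only.
print(boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 100]))
print(feature_selection_ratio([1, 0, 1, 0, 0]))
```

A narrower IQR means that the middle half of the scores lies closer together, which is the sense in which a smaller box indicates a more stable algorithm.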
For the statistical analysis, the Wilcoxon signed-rank test [70] is conducted to compare COFFO against the other methods. If the p-value is smaller than 0.05, the classification accuracies of the two compared methods are significantly different. Table 13 displays the results of the Wilcoxon test of COFFO against the other methods. The results indicate that COFFO's classification performance is significantly better than that of the remaining candidates in all cases except for HLBDA.
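The pairwise comparison described above can be sketched with SciPy's `scipy.stats.wilcoxon`, which implements the Wilcoxon signed-rank test for paired samples. The accuracy values below are made up for illustration and are not results from this study; the helper name `compare_accuracies` is hypothetical.

```python
from scipy.stats import wilcoxon

def compare_accuracies(acc_a, acc_b, alpha=0.05):
    """Paired Wilcoxon signed-rank test on per-dataset accuracies.

    Returns the p-value and whether it falls below the significance level.
    """
    result = wilcoxon(acc_a, acc_b)
    p_value = float(result.pvalue)
    return p_value, p_value < alpha

# Hypothetical per-dataset accuracies of two methods (illustration only).
method_a = [0.95, 0.91, 0.88, 0.97, 0.85, 0.92, 0.89, 0.94]
method_b = [0.94, 0.89, 0.85, 0.93, 0.80, 0.86, 0.82, 0.86]

p_value, significant = compare_accuracies(method_a, method_b)
print(f"p-value = {p_value:.4f}, significant at 0.05: {significant}")
```

The test is paired, so both lists must hold one value per dataset in the same order. A significant p-value shows only that the two methods differ; the direction of the difference has to be read from the accuracies themselves.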
Particular emphasis should be placed on COFFO's performance in high-dimensional datasets, such as TOX171 and Leukemia. Experimental results indicate that the proposed approach is more effective in selecting relevant features than the original FFO and other tested methods.
For a more extensive analysis, error rate convergence graphs of COFFO, FFO and six more methods on eight datasets are provided in Fig 4. The introduced COFFO generated the best initial population on six out of eight datasets, showing a considerable advantage of the chaotic-based and opposition-based learning implementation. BPSO obtained the best results in its initial phase on the Dermatology dataset, while all tested algorithms performed similarly on the Colon dataset. On most datasets, the proposed COFFO generates a considerably better initial population than the original FFO.

COVID-19 dataset and results
What started as an acute respiratory syndrome outbreak in China quickly became a pandemic. COVID-19, the disease caused by the SARS-CoV-2 virus, has caused the deaths of millions of people worldwide since the outbreak began [71,72]. Artificial intelligence can help with the prevention, detection and diagnosis of COVID-19 [73]. This section presents the application of the proposed algorithm to COVID-19 patient health prediction. The dataset of COVID-19 cases was gathered from [74]. Table 14 shows the fifteen features contained in the dataset. The aim is to predict death and recovery outcomes from specific factors. Only the records with a "death" or "recov" status are considered. For validation, the data is divided equally into two disjoint sets, training and testing. Each feature is encoded numerically.
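The filtering and splitting step described above can be sketched as follows. This is a minimal illustration: the record layout, the field name `status` and the shuffling before the split are assumptions, since the section states only that records with a "death" or "recov" status are kept and divided equally into disjoint training and testing sets.

```python
import random

def filter_and_split(records, seed=0):
    """Keep records with a known outcome and split them into two halves."""
    # Keep only the two outcome classes named in the text.
    kept = [r for r in records if r.get("status") in ("death", "recov")]
    # Assumption: records are shuffled before splitting; seed for repeatability.
    rng = random.Random(seed)
    rng.shuffle(kept)
    mid = len(kept) // 2
    return kept[:mid], kept[mid:]

# Hypothetical records for illustration only.
records = [
    {"id": 1, "age": 61, "status": "death"},
    {"id": 2, "age": 34, "status": "recov"},
    {"id": 3, "age": 45, "status": "unknown"},
    {"id": 4, "age": 52, "status": "recov"},
    {"id": 5, "age": 70, "status": "death"},
    {"id": 6, "age": 29, "status": None},
    {"id": 7, "age": 48, "status": "recov"},
    {"id": 8, "age": 66, "status": "death"},
]

train, test = filter_and_split(records)
print(len(train), len(test))
```

If the number of kept records is odd, the testing set receives the extra record.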
As Table 15 shows, the proposed COFFO has the optimal mean fitness value, best fitness value and feature selection ratio, followed by HLBDA. Fig 5 shows the average accuracy and selected feature size of all compared algorithms tested on the COVID-19 dataset. COFFO outperformed the original FFO and the other algorithms by attaining an average classification accuracy of 92.46% and the smallest feature size of 2.29. According to the collected data, the most frequently selected features were gender, age and symptom2. On the other hand, id and symptom6 were never selected by the COFFO algorithm.
The results indicate that these features are ineffective in discerning the data patterns in the patient health prediction procedure. The accuracy of patient health prediction could be improved in the future by gathering additional clinical features.

Discussion
The results illustrate that COFFO has shown the best performance in selecting relevant features while significantly reducing dimensionality. The improvement is reflected in both the exploration and exploitation stages. Due to its fixed position update strategy, the original FFO can get stuck in local optima in its exploration phase. To solve this problem, opposition-based learning and the mapping of solutions onto chaotic sequences have been implemented, yielding an initial population that is closer to an optimum region of the search space and accelerating convergence towards the global optimum in a complex feature space. Further, the exploitation phase has been improved with a chaotic local search strategy for fine-tuned exploitation. The disadvantage of implementing chaotic opposition-based learning in the initial phase, and chaotic local search in the exploitation phase, is the increase in time complexity.
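The idea behind the chaotic, opposition-based initialization can be illustrated with a short sketch. It is not the exact COFFO procedure: the logistic map is assumed as the chaotic map, the opposite of a solution x is taken as lower + upper - x (the standard opposition-based learning definition), fitness is assumed to be minimized, and all function names are hypothetical.

```python
def logistic_map_sequence(length, x0=0.7, mu=4.0):
    """Generate a chaotic sequence in [0, 1] with the logistic map."""
    sequence, x = [], x0
    for _ in range(length):
        x = mu * x * (1.0 - x)
        sequence.append(x)
    return sequence

def chaotic_opposition_init(fitness, pop_size, dim, lower, upper, x0=0.7):
    """Build an initial population from chaotic solutions and their opposites.

    Assumption: fitness is minimized, so the pop_size candidates with the
    lowest fitness are kept.
    """
    chaos = logistic_map_sequence(pop_size * dim, x0)
    candidates = []
    for i in range(pop_size):
        # Map chaotic values from [0, 1] onto the search range.
        solution = [lower + chaos[i * dim + d] * (upper - lower)
                    for d in range(dim)]
        # Opposite solution, mirrored around the centre of the range.
        opposite = [lower + upper - v for v in solution]
        candidates.append(solution)
        candidates.append(opposite)
    candidates.sort(key=fitness)
    return candidates[:pop_size]

def sphere(x):
    """Simple test function with its minimum at the origin."""
    return sum(v * v for v in x)

population = chaotic_opposition_init(sphere, pop_size=5, dim=3,
                                     lower=-5.0, upper=5.0)
print([round(sphere(p), 3) for p in population])
```

Evaluating both a solution and its opposite doubles the number of fitness evaluations during initialization, which is consistent with the increase in time complexity noted above.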

Conclusion
This study presents a novel chaotic oppositional fruit fly optimization algorithm (COFFO), a wrapper-based technique for feature selection. COFFO employs chaotic-based and opposition-based learning to improve the performance of the original algorithm. In line with current practice in evaluating optimization algorithms, the introduced algorithm is tested on ten unconstrained benchmark functions from CEC2019. For comparative analysis, eleven other well-known metaheuristic methods are tested under the same experimental conditions. The mean fitness and standard deviation of the tested algorithms are compared and, additionally, statistical tests are conducted, which indicate that COFFO outperforms all the other tested methods. Further, the proposed approach significantly outperforms the original FFO. The next phase centres on applying COFFO to 21 standard datasets. For performance comparison, eleven other well-known approaches are tested under the same experimental conditions. The best fitness value, the mean fitness value, the standard deviation, the accuracy and the feature selection ratio are used for comparison. The Wilcoxon signed-rank test is also conducted to compare the proposed COFFO against the other methods. Across these datasets, COFFO outscored the tested algorithms in most cases, specifically on high-dimensional feature sets.
Finally, COFFO is employed in COVID-19 patient health prediction, where the introduced algorithm achieved excellent performance, surpassing the compared algorithms. Compared with its peers, and especially with the original FFO, COFFO can select a subset of significant features with high discriminatory capacity. Taking everything into account, the presented COFFO not only obtains the highest classification accuracy but is also effective in dimensionality reduction.
As part of future research, the proposed COFFO can be tested on various NP-hard optimization challenges from domains such as cloud computing, wireless sensor networks and portfolio optimization, and applied to enhance machine learning models.